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Separation of the B component of a cosmic microwave background (CMB) polarization map from 
the much larger E component is an essential step in CMB polarimetry. For a map with incomplete 
sky coverage, this separation is necessarily hampered by the presence of "ambiguous" modes which 
could be either E or B modes. I present an efficient pixel-space algorithm for removing the ambiguous 
modes and separating the map into "pure" E and B components. The method, which works for 
arbitrary geometries, does not involve generating a complete basis of such modes and scales the 
cube of the number of pixels on the boundary of the map. 



I. INTRODUCTION 

A great deal of attention in cosmology is focused on 
attempts to characterize the polarization of the cosmic 
microwave background (CMB) radiation. Multiple ex- 
periments have detected CMB polarization [J-Q- The 
Planck Surveyor [3] will provide all-sky polarization maps 
at many frequencies, and many other experiments are in 
development. Many in the cosmology community regard 
the quest for high-quality CMB polarization maps as an 
extremely high priority [1, Q . CMB polarization maps 
can potentially provide a great deal of information about 
our Universe, complementing the wealth of information 
provided by CMB temperature anisotropy. 

Although polarization maps have multiple uses, much 
of the interest in CMB polarization stems from the pre- 
diction that a stochastic background of gravitational 
waves, prouced during an inflationary epoch, may be vis- 
ible in polarization maps. Detection of this background 
would be of revolutionary importance, providing direct 
confirmation of inflation as well as a measurement of the 
inflation energy scale. 

The possibility of detecting this gravitational wave 
background is a consequence of the fact that a CMB 
polarization map can be regarded as a sum of two com- 
ponents, a scalar E component and a pseudoscalar B 
component 

[IMl- To linear order in perturbation the- 
ory, "ordinary" scalar perturbations populate only the 
E component. The tensorial gravitational waves, on the 
other hand, populate both E and B components. Be- 
cause the tensor component is known to be smaller in 
amplitude than the scalar component, detection of it re- 
quires a channel that is free of scalar contributions. B- 
component polarization may provide this channel. 

The B component is predicted to be at least an order 
of magnitude smaller than the E component on all angu- 
lar scales. It is therefore important to make sure that any 
detection of B-type polarization is free from contamina- 
tion from the E component. Such contamination can be 
caused by systematic errors, of course p^4l7| . but even 
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in an error-free map one must worry about mixing of E 
and B components in the data analysis process [18|. For 
a map with complete sky coverage, the two components 
can be perfectly separated in the spherical harmonic do- 
main, but for a map with only partial sky coverage care 
must be taken to separate components in a leakagc-frcc 
way [TjlH. 

One way to think about this concern is in the language 
of the "pure" and "ambiguous" components of a polar- 
ization map [2l|. A "pure E" (resp. B) mode on the 
incomplete sky is one that can only have been produced 
by i?-type (resp. -B-type) polarization, while an ambigu- 
ous mode is one that could have been produced by either 
component. To be specific, a (not necessarily pure) E 
(resp. B) mode P is one that satisfies a certain differen- 
tial relationship • P = (resp. • P = 0). Explicit 
forms for the differential operators T>e.b are given in Sec- 
tion QH and further details may be found in refs. [20ll2lj]. 
An ambiguous mode is one that satisfies both conditions 
simultaneously. There are no modes satisfying both con- 
ditions on the complete sphere, but there on any domain 
consisting of only part of the sphere. 1 

"Pure" E (resp. B) modes are defined to be orthogo- 
nal to all B (resp. E) modes, including the ambiguous 
modes. Explicit examples of pure and ambiguous modes 
may be found in ref. |21| . 

On the incomplete sky, the decomposition into E and 
B components is not unique, since there is no way to 
decide where to put the ambiguous modes. One way to 
illustrate this is to imagine working with a square patch 
of sky in the flat-sky approximation. The E-B decom- 
position is trivial in Fourier space, so one could perform 
the decomposition by Fourier transforming, decompos- 
ing, and transforming back. Now imagine performing 



The analogy between spin-two polarization fields and spin-one 
vector fields is helpful here. On a two-dimensional surface with- 
out boundary, such as a sphere, any smooth vector field can be 
uniquely decomposed into curl-free and divergence free compo- 
nents. On a surface with boundary, however, the decomposition 
is not unique due to the existence of "ambiguous" vector fields 
that are both curl-free and divergence-free, such as a;x — yy, de- 
fined over a subset of the plane. 
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this set of operations after padding the observed map 
out to a larger size, inserting any values you like in the 
unobserved region. Different E/B decompositions will 
result depending on how the padding is performed, all 
of which will match the data over the observed region. 
If all one has access to is the observed patch, there is 
no way to tell which if any of these is the "real" E/B 
decomposition. 

Although the E/B decomposition is not unique for a 
partial sky map, the decomposition into pure E, pure B, 
and ambiguous components is unique [21] . In practice, 
if the E component dominates over the B component as 
expected, the ambiguous modes will mostly contain infor- 
mation about E modes, and a robust detection of B-type 
polarization must therefore be sought in the pure B com- 
ponent. One consequence of this loss of B information to 
ambiguous modes is a shift in the optimum tradeoff be- 
tween sensitivity and sky coverage [18| : the optimum sky 
coverage for a i?-type experiment is larger than would be 
found by a straightforward "Knox formula" [23 |. 

The information loss due to ambiguous modes has two 
sources: incomplete sky coverage and pixelization [2lj |. 
Pixelization ambiguity is essentially a consequence of 
aliasing of Fourier modes (working in the flat-sky ap- 
proximation for simplicity). In a map with pixel size 
Lpix, Fourier modes with wavevector components greater 
than the Nyquist frequency &N y = 27r/L p i X are aliased 
to modes of lower frequency. In the process, the wave 
vector k is mapped to a new vector with, generically, 
a completely different direction. The decomposition of a 
Fourier mode into E and B components depends entirely 
on the direction of k, so aliasing thoroughly scrambles E 
and B modes. There is only one way to avoid this: one 
must pixelizc finely, pushing the Nyquist frequency to a 
level where beam suppression makes aliasing negligible. 

If a data set has been pixelized too coarsely, resulting in 
significant aliasing, no data analysis method can undo the 
E-B mixing. Therefore, in this paper, I will assume that 
the data have been sufficiently finely pixelized (relative 
to the beam size) to control pixclization-induccd E-B 
mixing to an acceptable level. The method described 
in this paper is aimed at removing the incomplete-sky- 
induced ambiguous component. 

Of course, no matter how fine the pixelization, the 
noise will not be smooth on the pixel scale. Separate 
tests are therefore required to make sure that the decom- 
position described herein is well-behaved with respect to 
noise in the data. Section ITVl discusses such tests. 

The original method for decomposition into pure E, 
pure B, and ambiguous components (hereinafter referred 
to as an E/B/ A decomposition) involved the construction 
of an orthornormal basis of such modes by solution of an 
eigenvector problem of dimension 27V p i X , the number of 
pixels in the data set. Such an operation is computation- 
ally extremely expensive for large data sets. This paper 
will present an alternative algorithm that is far more ef- 
ficient. 

Even on an incomplete sky, it is easy to separate a po- 



larization map into E and B components, if one does not 
worry about the purity of those components: one sim- 
ply performs the operation in Fourier space (if the flat- 
sky approximation is appropriate) or spherical harmonic 
space; in either case the separation can be done mode 
by mode. The difficulty comes in projecting out the am- 
biguous component from each of these components. The 
ambiguous component of each map is determined by data 
on the boundary of the map, so it is natural to seek meth- 
ods of finding it and projecting it out by examining only 
pixels near the boundary of the observed region. This 
paper will present one such method. As we will see, such 
methods can be far more efficient than the naive method 
of finding a complete set of normal modes. 

In principle, in order to perform power spectrum es- 
timation (the primary goal of a CMB polarization ex- 
periment), it is not necessary to perform any E/B/A (or 
indeed E/B) separation at all. For any given choice of E 
and B power spectra, one can in principle compute the 
likelihood function L(C Z B ,C Z B ) and use it to draw con- 
fidence intervals or Bayesian credible regions in power 
spectrum space. If this analysis excludes Cf = 0, then 
B-type power has been detected. For large data sets, 
where the full likelihood is too expensive to compute, 
other methods such as the pseudo-C/ method have been 
generalized from temperature anisotropy to polarization 
data [2514291 ] . Such methods achieve near-optimal power 
spectrum estimates without the need to perform an ex- 
plicit E/B/A decomposition. There may well be other 
data analysis methods that do not involve worrying about 
an E/B/A decomposition. For example, one might ana- 
lyz e interferometric data entirely in visibility space (30l — 
|32| , without ever constructing a real-space map at all. 

Nonetheless, E/B/A separation of polarization data 
sets will be useful for several reasons. Although the 
power spectra are the primary quantities to be measured 
in a CMB data set, they are not the end of the story; 
real-space maps are necessary for a variety of applica- 
tions. Probably the most important will be tests for 
foreground contamination, which may be easier to do via 
real-space cross-correlation with foreground templates. 
Use of the lensing i3-mode signal to constrain cosmo- 
logical parameters (e.g., |33l43a |) depends on real-space 
JS-component information, rather than just the power 
spectra. Searches for non-Gaussianity and departures 
from statistical isotropy also go beyond the power spec- 
trum. A real-space picture of the pure B component will 
be an important "sanity check" to make sure that the 
detected B component looks qualitatively as expected. 
Last but certainly not least, people (both scientists and 
the broader public) will find it much easier to believe that 
B modes have really been detected if there is an actual 
map they can be shown. 

Other methods have been proposed for performing E- 
B separation on incomplete sky maps [36l439( |. These 
methods may prove extremely useful, but the method 
I propose herein differs from them in significant ways: 
it allows reconstruction of the actual polarization map 
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components (i.e., the observables Q,U) as opposed to a 
scalar derivative of these quantities, and it does so by 
solving the relevant differential equation for the ambigu- 
ous modes, rather than by making a heuristic approx- 
imation that the ambiguous modes can be regarded as 
confined to the boundary of the map. 

The remainder of this paper is organized as follows. 
Section HQ reviews the formalism behind pure and am- 
biguous components and lays out the schematic recipe 
for the E/B/A separation. Section UTT1 describes the tech- 
nical details of the implementation. Section HVl describes 
some tests of the algorithm, and Section [V] contains a 
brief discussion. 



II. SEPARATION INTO PURE AND 
AMBIGUOUS MODES 

A. Review of formalism 

We begin with a review of some useful relations involv- 
ing E and B modes. The reader wishing further detail 
can see, e.g., refs. [13, [H|. For simplicity, we work in the 
flat-sky approximation. Section [V] discusses the general- 
ization to the spherical sky. 

Let Q{r) and U{r) be the Stokes parameters as func- 
tions of position f on the sky. We will group them to- 
gether into a vector field 



(1) 



Here and throughout, arrows denote spatial vectors, 
while boldface denotes vectors in more general vector 
spaces. In particular, p is not a vector in position space 
- i.e., it transforms with spin two rather than one. 

For the present we neglect pixelization effects and allow 
ourselves to take derivatives. A polarization field is an E 
mode if it satisfies a second-order differential relation 



• p = 0, 

and is a B mode if it satisfies 

D^-p = 0. 



(2) 



(3) 



In the flat-sky approximation, the differential operators 
in these equations can be written 



D 



B = 



2d x d y 

~2d x d y 
dl-d\ 
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(5) 



These two operators are the spin-2 analogues of the di- 
vergence and curl respectively. They satisfy the following 
useful relations: 



• D B = • D £ = 0, 



Dt D s 



V 4 = (V 2 ) 2 



(6) 
(7) 



When working on the sphere rather than the plane, the 
operators T)e,b take on a more complicated form, and 
the bilaplacian V 4 is replaced by V 2 (V 2 + 2). 

Just as with curl- and divergence-free vector fields, we 
can express E and B modes in terms of potentials: any 
polarization field p e that satisfies the E mode condition 
can be written as the derivative of a potential ip E , and 
similarly for any B mode pb- 



p B = D B ^ B . 



(8) 
(9) 



An ambiguous mode, by definition, is one that simul- 
taneously satisfies the requirements of both E and B 
modes. We can construct such modes by choosing a bi- 
harmonic potential ip, i.e., one with 



V> = 0. 



(10) 



Then both and T)b^ will be ambiguous modes: 

equations ([6]) and ([7]), along with the biharmonicity con- 
dition, imply that both and D^, yield zero when ap- 
plied to these fields. 

A "pure" E mode is defined to be one that is orthogo- 
nal, over the observed region, to all B modes (including 
the ambiguous modes). Since it is an E mode, a pure 
E mode can always be derived from a potential ^e via 
equation ©, and in order to be pure the potential must 
satisfy both Dirchlet and Neumann boundary conditions 
on the boundary dfl of the observed region: 

V>E| 9 n = n- Vip E \ 9n = 0, (II) 

where ft is normal to the boundary. There is a unique bi- 
harmonic function satisfying a given set of Dirichlet and 
Neumann boundary conditions. Subtracting off the bi- 
harmonic function that matches the boundary conditions 
oiipE gives a unique way to "purify" a given E mode. 



B. Schematic recipe for E/B/A decomposition 

Suppose that we have a polarization field p observed 
over a region fi. If we ignore pixelization issues and as- 
sume that we can differentiate, then we can decompose 
the field into pure E, pure B, and ambiguous components 
as follows: 

1. Decompose p into E and B components without 
worrying about purity, specifically by finding a pair 
of potentials if)E,ipB such that 



p = Dsi'E + T> B ^B- 



(12) 



2. Find functions o.e-,olb that are biharmonic, i.e., 



V 4 a x = 



(13) 



(where X is either E or B) and that match the 
potentials on the boundaries: 



a x\da 
n-Va x \ dn 



(14) 
(15) 
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3. "Purify" the potentials by defining 

4>pE = TpE ~ UE, IppB = IpB - OLB- (16) 

4. Apply the differential operators to obtain the pure 
E, pure B, and ambiguous polarization fields: 

PpE = DeV'pE (17) 
PpB = D B -0 P s (18) 
p a = D B a £ + D B tt B . (19) 

For pixelized data, we can follow a similar procedure, 
with appropriately denned differential operators. Of 
course, we then have to test whether the discretization 
has introduced significant errors. 

III. IMPLEMENTATION 



in the Fourier domain, since the E/B decomposition can 
be done mode by mode in Fourier space (Section [HI Bp . 

Next we must find biharmonic functions satisfying the 
required boundary conditions. This can be done, e.g., by 
relaxation methods, but another method appears to be 
more efficient. We seek a function a such that V 4 a = 
within the observed region D, and a satisfies certain 
boundary conditions on dQ. We can identify a set of 
"source" points S outside of the observed region on which 
V 4 a is allowed to be nonzero. By solving a linear sys- 
tem, we find the values of V 4 a on S such that, when the 
inverse bilaplacian operator V -4 is applied, the resulting 
a satisfies the required boundary conditions. The inverse 
bilaplacian is a simple convolution with a fixed kernel and 
is efficient to apply in the Fourier domain. Section 1111 CI 
supplies details. 

This procedure completes step 3 in our recipe, and step 
4 is then trivially implemented in the Fourier domain, as 
described in the very brief Section IIIIDI 



The procedure described in the previous section is 
straightforward in principle, but to implement it numer- 
ically on a discretized grid requires some care. In this 
section I will describe this procedure in more detail. I 
begin with a summary of the key points. The reader 
with sufficient patience can then proceed to examine the 
technical details in the rest of the section. 

Before embarking on the above recipe, we need to em- 
bed our data in a rectangular grid, regarded as satis- 
fying periodic boundary conditions, suitable for discrete 
Fourier transforms. We will then apply the various dif- 
ferential operators in the Fourier domain. 2 Since the ob- 
served data will cover only a portion of the grid, 3 we 
must extend it into the unobserved region. This exten- 
sion must be smooth in order to avoid artifacts near the 
boundaries when we differentiate. 

The Fourier-space differential operators are exact for 
functions that are band limited below the Nyquist fre- 
quency. We assume that the pixelization is fine enough, 
compared to the experimental beam, that the intrinsic 
signal has negligible power above the Nyquist frequency. 
The smooth extension must be designed so that the ex- 
tended data are also smooth enough on the pixel scale 
to have power spectra that become negligible well before 
the Nyquist scale. I describe one method for smoothly 
extending the data in Section fill Al 

Once the data have been extended, we can begin the 
decomposition. Step 1 is straightforward to implement 



2 One can instead work with discretizations of the differential oper- 
ators involving nearest neighbors [39| . Such methods are inexact 
even when applied to low-frequency Fourier modes and lead to 
non-negligible E/B mixing even for modes significantly below the 
Nyquist frequency. 

3 Even if the data lie on a rectangular grid, it is necessary to pad 
it out to a larger grid to avoid artifacts due to periodic boundary 
conditions. 



A. Smooth extension of data 

Suppose that we have a data set p b s defined on a 
subset of the pixels in our grid. We need to extend it 
smoothly to a field p that covers the entire grid and 
matches p bs where data exist. 

There are no doubt many ways to achieve this goal. 
One choice would be to create a realization of a Gaus- 
sian random process, constrained to match the observed 
data. Smoothness of the extension can then be enforced 
by assigning a steeply declining power spectrum to the 
random process. 

There is a well-established process [4(| 5l[ for gener- 
ating constrained realizations of Gaussian random fields. 
The process involves two steps: first, the mean field of 
the constrained random process is found, and then ran- 
dom fluctuations are generated about that mean field. 
For our present purposes, we can simply stop after the 
first step: the mean field matches the constraints and 
is in fact smoother than the typical realization, so it is 
better for our purpose. 

We now describe this process in detail. Consider a 
Gaussian random process that generates a polarization 
map g = (Q, U), characterized by E and B mode power 
spectra Pb(/c), Pb(/c). Suppose that the values of g are 
constrained to match p b s over the observed region, or 
at least over a subset of the observed region lying near 
the boundary. In practice, it is sufficient to choose to 
impose the constraint only on a band of pixels consisting 
of the Te X t nearest neighbors of boundary pixels, for some 
reasonably small T oxt . We wish to determine the most 
probable realization of g on the rest of the pixels. 

It is important to emphasize that the resulting g is 
not supposed to be the "real" polarization map extrapo- 
lated into the unobserved region. It is merely an artificial 
field designed solely to join smoothly onto the observed 
field. The power spectra Pe,Pb do not have to match 



FIG. 1: Illustration of maximum-likelihood extension. The upper left panel shows a simulated data set, observed over a square 
region with two holes in it. The direction of polarization is indicated by hue and the polarization amplitude by value in the 
HSV color system. The disc in the lower left illustrates this scheme, with polarization amplitude increasing with distance from 
the center and direction varying with angle. The data were simulated with power spectrum Pb(A:) oc k~~ 2 with no B modes, 
and smoothed with a Gaussian beam of width a = 4 pixels. The right panel shows the mean-field Gaussian extension of the 
data. The extension was calculated using a power spectrum P(k) oc fc -4 , with Pb = O.OIPe, and constrained to match the 
data over a 3-pixel boundary region. 



those of the real data; in fact, it is better if they are very 
steeply declining, so that the Gaussian random process 
will strongly favor smooth functions. 

Let the number of pixels in the entire grid be N p i x 
and the number of constraint pixels be N cons . Imagine 
writing the values of the Gaussian random field over the 
entire grid in a 2./V p i X -dimensional vector G. Some subset 
of these points are the 2N cons constraint values. These 
are listed in a 2A r cons -dimensional vector C. 

Let 

En = (GiGj) (20) 

be the correlation between any two elements of the (un- 
constrained) Gaussian random field. We use these values 
to define two matrices: H cc is the 2iV cons x 2iV cons dimen- 
sional matrix giving the correlations between constraint 
points, and E 9C is the 2iV p j x x 2iV cons dimensional matrix 
giving the correlations between arbitrary points and con- 
straint points. Then the mean field, or most probable, 
value of G is 

G = H 9C • (H cc ) _1 • C. (21) 

Although this procedure sounds cumbersome, it is 
quite simple to apply. The covariance matrix elements 
Sjj are related in simple ways to the Fourier transforms 
of the power spectra Pe,Pb- Moreover, they depend 
only on the vector separation between two pixels, which 
means that the multiplication by the matrix a gc in equa- 
tion (|2ip is a convolution that can be done in time 



iVpj x ln iVpj x . Calculation of the vector (H cc ) _1 • C, on 
the other hand, requires time 0(N^ ons ). For a reason- 
ably convex map with a "nice" boundary, the number 
of boundary pixels is of order the square root of the to- 
tal number of pixels, in which case this is equivalent to 

o(nH 2 s ). 

Note that the computation of the H matrices, as well 
as the Cholesky decomposition required on H cc depend 
only on the geometry of the system and are thus precom- 
putable. 

Figure [1] shows an example of this process. The "ob- 
served" region is a 150 x 150 pixel patch with two circular 
holes 10 pixels in radius. The data consists of an E-type 
polarization field with a power spectrum Pg(fc) oc k~ 2 . 
(The simulation was made on a much larger 1200 x 1200 
grid and truncated, so that the observed grid does not 
satisfy periodic boundary conditions.) The observed re- 
gion has been extended to a 256 x 256 grid, including 
filling in the holes, following the procedure above. The 
extension was constrained to match the observed data 
over a boundary layer 3 pixels thick. The Gaussian ran- 
dom process assumed for the extension had both Pe and 
Pb proportional to fc -4 and Pg = .OlPg. 

The extension smoothly joins onto the observed region 
and satisfies periodic boundary conditions, so that any 
further computations involving discrete Fourier trans- 
forms will be free of artifacts from discontinuities. Note 
that the extension becomes featureless far from the ob- 
served region. This simply reflects the fact that, when 
the constraints are unimportant, the most probable value 
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FIG. 2: Numerical instability in extending maps. The original map (left panel) is a single B-type Fourier mode. It was 
extended using the mean-field method, with results shown in the center and right panels. The center panel shows the result of 
the extension when the Gaussian random process had no B power. The result is an attempt to match the input B mode using 
only E modes, leading to numerical instability: the maximum polarization level in the extended region is 590 times that of the 
original map, so that the original map (lower left quadrant) is invisible. The right panel shows the result obtained when Pb is 
taken to be O.OIPe. In this case, the peak polarization level is 4.3 times that in the extended region. 



of a Gaussian random process is uniformity. 

Some choices must be made in applying this method, 
namely the thickness of the boundary layer on which to 
force matching, and the power spectra for both E and 
B modes in the Gaussian random process assumed for 
the extension. The choice of boundary layer thickness 
is governed by the desire for computational efficiency: if 
it does not lead to excessive computation time, there is 
no reason not to constrain on the entire observed region. 
However, since the purpose of the constraint is simply to 
insure smoothness of the extension, this is not necessary: 
numerical tests indicate that a boundary layer of a few 
pixels is sufficient. 

The choice of power spectrum does not seem to make 
very much difference either, as long as it is a strongly de- 
creasing function of k. I adopt a power-law power spec- 
trum P(k) oc fc~™ s , typically with n s = 4. The method 
can be applied with different power spectra for E and B 
modes. I recommend setting Pe to be much larger than 
Pb, so that the extension will be nearly all E modes. 
This has the result that artifacts resulting from the ex- 
tension will infect the pure E map more than the pure 
B map and thus do less damage. However, Pb should 
not be taken to be zero: if the data set contains actual 
B modes, trying to satisfying the constraints with only 
E modes can lead to numerical instability as shown in 
Figure 2. 

Section HVl discusses some numerical tests on these pa- 
rameters, showing that the final pure and ambiguous 
maps are insensitive to the choices made over a broad 
range. 



B. Initial decomposition 

After generating a smooth extension of the data onto a 
rectangular grid with periodic boundary conditions, de- 



composing the data into (impure) E and B components 
is trivial. We take discrete Fourier transforms of both Q 
and U and perform the decomposition mode by mode in 
the Fourier plane. For a given vector k, making an angle 
a with the x axis, an E mode would satisfy 

< 22) 

for some (fc-dependent) scalar E, and similarly, a B mode 
would satify 

We add these expressions, set the sum equal to (Q,U), 
and solve: 

E = Qcos2a + [7 sin 2a, (24) 
B = -Q sin 2a + Ucos 2a. (25) 

The fields E and B are the laplacians of the correspond- 
ing potentials ipE, ipB, so in Fourier space, 

Tp E = -k~ 2 E, ip B = -k- 2 B. (26) 

The monopole (k = 0) mode cannot be treated in this 
way, of course; it must be removed from the map and 
treated as an ambiguous mode. In addition, modes for 
which either k x or k y equals the Nyquist frequency must 
be treated with care: for such modes, we do not know the 
sign of one component of k, and hence do not know the 
quadrant of a. This means that the terms proportional 
to sin 2a in equations and ([23]) have unknown sign. 
In the numerical implementation of this step, I separate 
these pieces and place them along with the monopole in 
the ambiguous component at the end of the process. Of 
course, the philosophy underlying the entire approach in 
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this paper is that the Nyquist length is well below the 
smoothing scale, in which case this is always a minor 
consideration. 



C. Solving the bilaplacian equation 

We now consider the "purification" of ipE and tpB- As 
we have seen, this step requires finding a biharmonic 
function a for each potential ip whose value and first 
derivative match the potential on the boundary. For ef- 
ficiency's sake, we wish to avoid method that require so- 
lution of linear systems of order JV p i x dimensions. 

One efficient method is to find f3 = V 4 a and then apply 
the inverse bilaplacian operator. We know that /3 = in 
the observed region, but it can (and indeed must) be 
nonzero for some unobserved pixels. We choose a set of 
N SIC "source pixels," all lying in the unobserved region, 
and allow /3 to be nonzero only on these pixels. The 
boundary conditions are then a collection of 27Vbdy linear 
equations to be solved for N STC unknowns. Generically, 
we expect solutions to exist if N SIC > 2./Vbdy- 

Naturally, the solution we find will depend on the 
choice of source points, and even for a given set of source 
points, there will typically be multiple solutions. We 
might hope that all such solutions would coincide within 
the observed region, because of the uniqueness theo- 
rem for biharmonic functions. Unfortunately, this is not 
the case: the uniqueness theorem applies to functions 
whose boundary values and derivatives are specified at 
all boundary points, but we are imposing only a finite, 
discrete set of boundary conditions at the pixel locations. 
We must hope, therefore, that a judicious choice of source 
points and of solution to the linear system will yield a 
good approximation to the "correct" solution that we 
seek. In addition, of course, we must adopt criteria to 
judge whether we have succeeded. 

It is plausible that the best choice of source points 
would be those lying near the boundary of the observed 
region. One way to see this is to note that the relation be- 
tween /3 and a is simply convolution with a fixed kernel: 
a = V~ 4 /3 in real space, so that a — fc~ 4 /3 in Fourier 
space. The real-space convolution kernel is shown in 
Figure[3] The kernel peaks at the source point and decays 
gradually. If we try to satisfy the boundary conditions 
using source points that are far away from the bound- 
ary, the sources will have to be quite large. Moreover, 
since each faraway source point will populate all of the 
boundary points to comparable levels, delicate cancella- 
tions of source points will be required. However, source 
points lying near the boundary will primarily populate 
their neighboring boundary points, which might plausi- 
bly lead to a more stable numerical system. 




FIG. 3: Convolution kernel for the inverse bilaplacian. This 
graph shows the function a — V~ 4 ,3 in the case where f3 is 
nonzero at only one pixel, lying at the center of the figure. For 
arbitrary /3, a is obtained by convolution with this kernel. 



Based on this heuristic reasoning, I chose source points 
to lie in a boundary layer around the observed region, 
with the thickness T src of the layer a free parameter. 
As the tests in the next section will show, this choice 
worked well. The primary errors arose in pixels right 
at the boundary. I therefore generalized the approach to 
consider source pixels that were offset from the boundary, 
i.e., those whose distance d (in pixels) from the bound- 
ary lay in the interval A src < d < A S1C + T src , where 
the offset A src and the thickness T src are both adjustable 
parameters. 

The equations to be solved are in general underdeter- 
mined. One way to choose a solution in this case is to 
solve the system via singular value decomposition, which 
leads to the minimum-norm solution to a linear system. 
As we will see in the next section, a better solution is of- 
ten obtained by retaining only some of the singular values 
in solving the system. The number of values to retain, 
-/Vging, is thus an additional adjustable parameter. 

For a given choice of source points, the solution is 
straightforward to implement. The convolution kernel 
K for the inverse bilaplacian is found in the pixel do- 
main in 0(iVpi X In A^pix) time. If we then represent the 
linear system to be solved as a 27Vbdy x N svc matrix M, 
then the matrix elements for the first iVbdy rows (corre- 
sponding to the Dirichlet boundary conditions) are of the 
form Mij = K(pi — qj), where pi is the location of the zth 
boundary pixel and q*j is the location of the jth source 
pixel. We can similarly compute the vector- valued kernel 
Kg for the gradient of the inverse bilaplacian, V(V -4 ). 
If hi represents a unit normal to the boundary at the 
iih boundary pixel, 5 then the lower half of the matrix M 



4 We take the operator V 4 to have no monopole: a(0) = 0. 



5 One way to define this vector is to compute the derivative of 
the mask, which consists of 1 for observed pixels and for un- 
observed pixels. This derivative, evaluated at boundary pixels, 
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Extension 
to unobserved 
region 


Power spectrum index 
B to E power ratio 
Boundary thickness (pixels) 


n s — 4 
P B /P E = 0.01 
T ext = 3 


Solving for 
biharmonic 
functions 


Source layer thickness (pixels) 
Source layer offset (pixels) 
Singular values retained 


T STC = 10 
A src = 1 
7V sing = 600 



TABLE I: Fiducial values for adjustable parameters in the 
E/B/A decomposition. The first three are used in extending 
the data into the unobserved region (Section IIII All , and the 
last three are in solving the boundary-value problem for the 
ambiguous modes (Section IlII C[) . 



(i.e., the rows corresponding to the Neumann boundary 
conditions) consists of elements of the form hi-K g (pi—qj). 

The most time-consuming step is the singular value 
decomposition, requiring 0(Ng dy N SIC ) ~ 0(T src A^ dy ) 
time. Note, though, that this step depends only on the 
pixel geometry and can be precomputed. 

Once the decomposition is performed, solving the lin- 
ear system for any particular data set requires time 
0(A^ src A^bdy)- Using the result to populate a over the 
entire oberved region is done with a convolution, requir- 
ing 0(N p i x lniVpi x ) time. 



D. Constructing the pure and ambiguous maps. 

Once the pure and ambiguous potentials have been 
found, it is straightforward to construct the final polar- 
ization maps by applying equations (|17M19I) in the Fourier 
domain. There is only one minor technical note. In the 
first step, we removed both the monopole and part of the 
Nyquist-frequency modes before passing from the origi- 
nal map to the potentials. We should add these terms 
into the ambiguous component at the end. 



IV. TESTS 

If we had continuously-sampled data and hence could 
take derivatives, the E/B/A decomposition would be 
unique and exact. The numerical method described in 
the previous section reduces to the exact decomposition 
in the limit where the pixelization becomes infinitely fine, 
but for pixelized data it is only approximate. 

The Fourier-based derivative operators are exact for 
band-limited functions but approximate for functions 
with power above the Nyquist frequency. For (noise-free) 



points in the normal direction. Once normalized, it can be used 
for h. In fact, as long as h is not tangent to the boundary, the re- 
sults do not depend strongly on its direction. The reason is that 
the tangential derivative of a is already constrained due to the 
Dirichlet boundary condition, so the derivative in any linearly 
independent direction serves to constrain the normal derivative. 



data that are smoothed with a beam that is significantly 
larger than the pixel size, the intrinsic signal can be re- 
garded as band-limited, to a good approximation. So 
can the Gaussian process that generates the extension. 
As long as the constraints imposed are sufficient to make 
these two maps join together smoothly, we expect, to a 
good approximation, to be able to regard the entire ex- 
tended map as band-limited and hence differentiable in 
Fourier space. 

Even in this case, the decomposition procedure is ap- 
proximate, because the boundary conditions used to find 
the ambiguous modes are imposed only on a discrete set 
of boundary points, not continuously. Heuristically, we 
expect this to pose a problem chiefly on small scales, 
close to the pixel scale. As long as the data (including 
the extension into the unobserved region) have low power 
on scales close to the Nyquist scale, we can expect the 
method described above to work. 

By construction, the pure E (resp. B) component will 
be an E (resp. B) mode - that is, the pure E compo- 
nent will be a (sampled) derivative T>e of a band- limited 
potential, or equivalently it will be a discretization of a 
band-limited polarization field p P E with ■ p p E = 0. 
Similarly, the ambiguous component is by construction 
ambiguous, satisfying both E and B mode conditions. 
If the method fails, therefore, it will do so via a lack of 
purity of the supposedly pure components. This can be 
assessed by checking whether a map initially containing 
only E modes has contamination in the pure B com- 
ponent, and also by checking orthogonality of the three 
components 

The above considerations apply to smooth data. Of 
course, the assumption of smoothness does not apply to 
noise in the data, so the question of how the decomposi- 
tion acts on the noise is a very important one. For both 
signal and noise, the only way to know if the method is 
working is to perform numerical tests. Since the decom- 
position method is linear, we can measure its treatment 
of signal and noise separately. 

The simulated map in Figure [T] (hereinafter referred to 
as the signal map) will be used to illustrate the perfor- 
mance of the method. The top panel of Figure [4] shows 
the result of applying the E/B/A decomposition proce- 
dure to this map. The decomposition method has a num- 
ber of adjustable parameters, with the values listed in 
Table HJ The input map contained only E modes, so the 
pure B component should vanish. Note that the ampli- 
tude of this component has been increased by a factor 
10 simply to make it visible. 

The bottom panel of Figure 0] shows an E/B/A decom- 
position of a map containing pure white noise, with the 
same observation geometry. In this case, one expects E 
and B to be comparable in amplitude, so the B compo- 
nent is not amplified. 

To quantify the algorithm's performance, I assess the 
following: the extent of contamination of the pure B 
mode by E power, the orthogonality of the three compo- 
nents, and the fraction of noise power in the ambiguous 
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FIG. 4: E/B/A decomposition, (a) The top panel shows the decomposition of the simulated map in FigureQ] The pure E, pure 
B, and ambiguous components are shown from left to right. The amplitude of the pure B component has been increased by a 
factor 10 4 . (b) The bottom panel shows the E/B/A decomposition of a pure white noise map. In this case, the B component 
is not increased in amplitude. In both cases, the fiducial parameters in Table U were used. 



mode. The top line of Table [TT1 shows the results of these 
assessments. To quantify the contamination of the pure 
B mode, I list both the maximum and the rms polariza- 
tion amplitudes of the polarization in the pure B map 
in Figure HJa), compared to the rms amplitude of the in- 
put map. Since the contamination is highly concentrated 
near the boundary, the maximum is far larger than the 
rms, although it is still quite small. 

To assess orthogonality of the pure E, pure B, and 
ambiguous components, I compute the total polarization 
power in each of the three components (P = J2 p (Qp+Up) 
over all pixels), as well as for the input map. I then 
compute 

t = J P " + I' B + P : (27, 

y -Mnput 

which equals 1 if the three components are orthogonal 
and exceeds 1 if they are positively correlated. This quan- 
tity is computed for either the signal map or the noise 
map. The noise map provides a more stringent test, due 
to its abundance of high-frequency power. 

Finally, the table lists the rms polarization amplitude 
in the ambiguous component of the noise map, relative 
to the rms input noise power. Once again, one could 
use either the simulated signal or the noise map in this 
assessment. The advantage of using the noise power is 



that we can predict approximately what we expect to 
see from simple considerations. The number of ambigu- 
ous modes below the Nyquist frequency is approximately 
equal to twice the length of the boundary in pixels [21]. 
The total number of modes that can be measured is 
equal to twice the number of pixels. In a white-noise 
map, all orthonormal modes should have equal power, 
so the ratio of ambiguous-mode rms to total rms should 
be y^Nbdy/Npix. The number of boundary pixels in our 
case is A^dy = 932, and the number of observed pixels is 
^Vpix = 19986, so we expect the ratio to be approximately 
0.2. 

The results show low levels of contaminated power and 
near-orthogonality as desired, and a level of ambiguous 
power in the noise map quite close to the theoretical es- 
timate. 

The fiducial values in Table fl] were chosen by trial and 
error to yield good results for these and similar tests, 
although they are certainly not the result of a system- 
atic optimization procedure. On the contrary, increasing 
some parameters (T ex t, ^source in particular) causes the 
test results continue to improve very modestly, at the 
cost of greater computation time. The optimal choice 
of parameters will of course depend on the details of the 
data set to be analyzed and on the tradeoff between com- 
putation time and accuracy. 

Table |TT] and Figure [5] show the results of varying some 
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FIG. 5: Effects of varying decomposition parameters, (a) The power spectrum index n„ used for the extension is reduced from 
4 to 2. (b) The number of source points used in finding biharmonic functions is reduced by setting the parameter T somcc to 2. 
(c) The number of singular values retained in finding biharmonic functions is reduced to 100. Panels (a) and (b) show results 
for the simulated map of Figure [T] with the pure B component enhanced by a factor 10 4 as in Figure [4] Panel (c) is for a white 
noise map, with no B enhancement. 



of these parameters. For example, varying the spectral 
index used in extending the data to n s — 2 causes the 
extension to be less smooth, resulting in high-frequency 
power leaking into the pure B component [Figure EJa)]. 
Reducing the number of source points used to find the bi- 
harmonic functions also results in increased leakage [Fig- 
ure [S^b).] In both cases, recall that the B component 
is increased by 10 4 in amplitude to make it visible: as 
Table HI1 indicates, the actual levels of contamination are 
still quite small. 

The E/B/A decomposition algorithm treats E and B 
identically, except in the adoption of different power spec- 
trum normalizations for E and B modes used in extend- 



ing the data. The line in Table |TT] showing the effect 
of adopting a power spectrum ratio Pb/Pe — 100 illus- 
trates this breaking of symmetry: as expected, there is 
more leakage into the pure B mode when the extension 
is heavily weighted towards B modes. Equivalently, the 
fiducial ratio of 0.01 results in more leakage from B into 
E than vice versa. The fiducial choice Pb/Pe = 0.01 
is based on the assumption that leakage from E into B 
is more of a concern than leakage from B into E. If it 
is deemed important to maintain E/B symmetry in the 
process, however, one can choose Pb/Pe — 1- The re- 
sulting rms leakage in this case is still quite less than one 
part in 10 3 , although the peak contamination, near the 
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Parameters 


■Bmax 




£sig 


^noisc 


a noise 
''rms 


Fiducial 


2.8 x 10" 3 


5.5 x 10" 5 


1.0045 


1.022 


0.20 


Pb/Pe = 


2.5 x 10" 4 


2.1 x 10" 5 


1.0045 


1.21 


0.39 


Pb/Pe = 1 


3.3 x 10" 2 


4.2 x 10" 4 


1.0045 


1.021 


0.20 


Pb/Pe = 100 


6.5 x 10" 2 


1.0 x 10 -3 


1.0045 


1.021 


0.19 


T src — 5 


8.0 x 10~ 3 


1.4 x 10" 4 


1.0045 


1.019 


0.19 


T src — 2 


2.6 x 10~ 2 


7.8 x 10" 4 


1.0060 


1.018 


0.19 


A src = 


7.9 x 10" 3 


1.5 x 10 -4 


1.0047 


1.018 


0.19 


iVsing = 100 


0.10 


9.3 x 10~ 2 


1.020 


1.28 


0.51 


N sing = 1800 


7.5 x 10" 4 


4.4 x 10" 5 


1.018 


1.57 


0.90 



TABLE II: Results of tests of the E/B/A decomposition algo- 
rithm. The quantities firms and _B m ax are the maximum and 
rms contamination of the pure B component, which should 
ideally be zero, as a fraction of the input rms power. The 
quantities £ s ig,£noisc, defined in equation (|27[) . quantify the 
orthogonality of the three components in the signal and noise 
map respectively. Finally, ij™ is the noise rms found in the 
ambiguous component, as a fraction of the input noise rms. 
The first row shows results for the fiducial parameters of Ta- 
ble |U In each subsequent row, one parameter is varied from 
the fiducial values. 



edges, rises to a few percent. 

Some poor parameter choices result in numerical errors 
that can be seen in the failure of the components to be or- 
thogonal and in the excess power going into the ambigu- 
ous mode. In particular, as noted in Section liTI Al forcing 
the B component power spectrum Pg to be identically 
zero in the extension leads to numerical instability as the 
actual B modes are shoehorned into E power. Keeping 
too many singular values in solving for the biharmonic 
functions also leads to numerical instability, causing the 
ambiguous component to contain non-ambiguous modes 
[Figure GOc)]. 



V. DISCUSSION 

The tests in the previous section indicate that the 
methods described in this paper can give a good approx- 
imation to the pure E, pure B, and ambiguous modes of 
a CMB polarization data set. The components are close 
to orthogonal, and there is very low leakage of E modes 
into the pure B component. The leakage that does occur 
is, not surprisingly, close to the boundary of the observed 
region, where effects of boundary discretization are most 
important. 

Figure [6] provides an additional illustration of the 
method, using more realistic input data than the simple 
power-low power spectra in previous examples. An input 
map was created containing both E and B modes, with 
power spectra computed by CAMB 1 42] based on the 
best-fit WMAP ACDM cosmology |43||. with a tensor- 
to-scalar ratio T/S = 0.05. The simulated map is a 
7.5° x 7.5° square, pixelized into 150 x 150 pixels 3' in 



size. The map was smoothed with a a = 12' Gaussian 
beam. As in the previous example, the map was simu- 
lated on a larger 1200 x 1200 grid and truncated, so that 
it would not have periodic boundary conditions. This 
map is sensitive to multipoles 30 < I < 1000. 

The upper panel of Figure |H] shows the E and B com- 
ponents of the input map, as well as their sum. The 
lower panel shows the result of performing the E/B/A 
decomposition on the sum. In both cases, the B com- 
ponent is enhanced by a factor 15 for visibility. The 
recovered E and B components look qualitatively sim- 
ilar to the input components, especially away from the 
edges. The pure E component contains 81% of the input 
E power (Q 2 + U 2 ). The pure B component contains 46% 
of the input B power. The ambiguous component con- 
tains roughly 20% of the total power in the input map. 
As expected, nearly all of this ambiguous power comes 
from the E component of the original map. The three 
components are very close to orthogonal: the quantity £ 
in equation (f2"7| is 1.005. 

The method described in this paper was designed 
to avoid operations involving the solution of N p i x - 
dimensional linear systems. The scaling of the various 
steps in the algorithm with data size is therefore of in- 
terest. Imagine a data set with 7V p i x pixels and a bound- 
ary of length iVbdy pixels. If the data set is reason- 
ably round and has a smooth boundary, then we expect 

-^Vbdy ~ Npix ■ (If tne data are extremely "holey," due, 
e.g., to the removal of many point sources, then A^dy 
might be much larger. One might wish to search for spe- 
cial techniques for treating this particular case of many 
small round holes in the data.) 

The smooth extension of the data involves the solu- 
tion of a linear system of size N cons ~ (Text-^bdy), so the 
scaling of this step is 0{{T cxt N hdy ) 3 ) ~ 0(T e 3 xt ^ p 3 / x 2 ), as- 
suming a "nice" boundary. The most expensive step in 
the solution of this system is a Cholesky decomposition, 
which depends only on the pixel geometry and not on the 
data itself. Thus if multiple maps with the same geome- 
try are to be analyzed (e.g., in Monte Carlo simulations), 
this step can be precomputed. The parts of the smooth 
extension that cannot be precomputed scale at most as 
C(^cons)- (I n an y case, the method I have described for 
smooth extension is hardly unique; it is easy to imagine 
that faster ones can be found.) 

The step involving the solution of the bilaplacian 
equation scales similarly: there is a precomputable sin- 
gular value decomposition scaling as 0(N^ dy N STC ) ~ 

0{T SIC N^ dy ) ~ 0(T src A^ p 3 / x 2 ), again assuming a "nice" 
boundary for the last step. The non-precomputable part 



= ^b 2 dy) 



o(r src Ar pix ) 



of the process scales as 0(T S 

The rest of the process involves Fourier transforms, 
which of course scale as 0(Af plx In Ap; x ). 

I have described and implemented the algorithm in 
the flat-sky approximation for simplicity. Each step in 
the process generalizes in a perfectly natural way to the 
spherical sky, so a spherical implementation should be 
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FIG. 6: Illustration of E/B/A decomposition. The upper panel shows the E component, B component, and the sum of the 
two components for a simulated map based on a ACDM cosmological model with a tensor-scalar ratio T/S = 0.05. The map 
contains 150 x 150 pixels of size 3' and was smoothed with a a = 12' beam. The B component is enhanced by a factor 15 for 
visibility. The lower panel shows the result of the E/B/A decomposition: the pure E, pure B, and ambiguous components are 
shown from left to right. Again, the pure B component is enhanced by a factor 15. 



perfectly possible. In this case, the Fourier transforms 
must be replaced with spherical harmonic transforms. 
The portions of the algorithm that scale as N p - lx In N p i* 
will then scale as (i.e., the same as the HEALPix 

|44| programs synf ast and anaf ast, and still at least as 
fast as the slowest other steps in the algorithm). 

Alternative methods of performing an E/B/A decom- 
position have been proposed [361 - 4381 ] . These methods are 
all potentially useful, but they differ from the one pre- 
sented here in important ways. Some methods (36l . 
involve finding scalar- valued derivatives of the pure com- 
ponents (essentially, B p, but do not yield the actual 
polarization maps (i.e., Stokes Q, U) of the components. 
This has the advantage that the ambiguous contribution 
to the scalar maps is more concentrated at the bound- 
ary, so that approximate purification can be achieved by 
removing data near the edges. However, these methods 
do not allow for purification of the actual observables 
(Q, U), as for these quantities the ambiguous modes per- 
sist far into the interior (Figure 2]). For some purposes, 
one may wish to analyze the pure E and B components 
of the actual polarization map. 

In addition, there is a potentially promising method 
based on a wavelet decomposition [381 ] . It too involves 



removing the ambiguous modes via a hard cutoff near 
the boundary, but the cutoff is set in terms of the scale 
of each wavelet, rather than being a fixed number of pix- 
els. This is a sensible approach, as the distance an am- 
biguous mode persists into the interior of a map depends 
on the frequency of the source function on the boundary. 
In contrast, the method I have described does not make 
any a priori assumption about the ambiguous modes be- 
ing restricted to the proximity of the border. Rather, it 
solves the relevant equation to determine how far from 
the border the ambiguous modes persist. 

In preparing for the analysis of any particular data set, 
it would be extremely interesting to perform simulations 
to compare the performance of the various methods in 
detail. 
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